VOG: Summarizing and Understanding Large Graphs
Abstract
How can we succinctly describe a million-node graph with a few simple sentences? How can we measure the ‘importance’ of a set of discovered subgraphs in a large graph? These are exactly the problems we focus on. Our main ideas are to construct a ‘vocabulary’ of subgraph types that often occur in real graphs (e.g., stars, cliques, chains) and, given a set of subgraphs, to find the most succinct description of a graph in terms of this vocabulary. We measure success in a well-founded way by means of the Minimum Description Length (MDL) principle: a subgraph is included in the summary if it decreases the total description length of the graph. Our contributions are three-fold: (a) formulation: we provide a principled encoding scheme to choose vocabulary subgraphs; (b) algorithm: we develop VOG, an efficient method to minimize the description cost; and (c) applicability: we report experimental results on multi-million-edge real graphs, including Flickr and the Notre Dame web graph.
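The MDL inclusion rule stated in the abstract can be illustrated with a small sketch. The code below is not the paper's encoding scheme or the VOG algorithm: the cost model (a fixed `model_bits` per candidate plus a flat `bits_per_error_edge` for every edge the summary gets wrong) and all function and field names are assumptions made purely for illustration. It only shows the general idea of keeping a candidate subgraph when it lowers the total description length.

```python
from itertools import combinations


def clique_edges(nodes):
    """Undirected edges of a full clique on the given nodes."""
    return {frozenset(pair) for pair in combinations(nodes, 2)}


def star_edges(hub, spokes):
    """Undirected edges of a star with one hub and the given spokes."""
    return {frozenset((hub, spoke)) for spoke in spokes}


def total_cost(graph_edges, chosen, bits_per_error_edge=8.0):
    """Toy description length: model bits plus bits for every wrong edge.

    Assumption: a flat per-edge error cost; a real encoding would be
    derived from the structure sizes and the graph itself.
    """
    covered = set().union(*(c["edges"] for c in chosen))
    model_bits = sum(c["model_bits"] for c in chosen)
    # Errors are edges the summary misses plus edges it wrongly claims.
    error_edges = graph_edges ^ covered
    return model_bits + bits_per_error_edge * len(error_edges)


def greedy_summary(graph_edges, candidates):
    """Keep a candidate only if it decreases the total toy cost."""
    chosen = []
    best = total_cost(graph_edges, chosen)
    for cand in candidates:
        cost = total_cost(graph_edges, chosen + [cand])
        if cost < best:
            chosen.append(cand)
            best = cost
    return chosen, best


# Hypothetical graph: a 4-clique, a 3-spoke star, and one bridging edge.
graph = clique_edges([1, 2, 3, 4]) | star_edges(5, [6, 7, 8]) | {frozenset((4, 5))}

candidates = [
    {"name": "clique", "model_bits": 10, "edges": clique_edges([1, 2, 3, 4])},
    {"name": "star", "model_bits": 10, "edges": star_edges(5, [6, 7, 8])},
    # A structure that is not actually present in the graph.
    {"name": "spurious", "model_bits": 10, "edges": clique_edges([6, 7, 8, 9])},
]

summary, cost = greedy_summary(graph, candidates)
print([c["name"] for c in summary], cost)
```

In this toy setting the clique and the star are kept because each saves more error bits than it costs to describe, while the spurious clique is rejected because it adds both model bits and wrongly claimed edges.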
Similar resources
Discovery of Rare Sequential Topic Patterns in Document Stream
When and Where: Predicting Human Movements Based on Social Spatial-Temporal Events. Ning Yang*, Sichuan University; Xiangnan Kong, University of Illinois at Chicago; Fengjiao Wang, University of Illinois at Chicago; Philip Yu, University of
Active Multitask Learning Using Both Latent and Supervised Shared Topics. Ayan Acharya*, University of Texas at Austin; Raymond Mooney, University of Texas at...
Summarizing and understanding large graphs
How can we succinctly describe a million-node graph with a few simple sentences? Given a large graph, how can we find its most ‘important’ structures, so that we can summarize it and easily visualize it? How can we measure the ‘importance’ of a set of discovered subgraphs in a large graph? Starting with the observation that real graphs often consist of stars, bipartite cores, cliques and chains...
Summarizing Static and Dynamic Big Graphs
Large-scale, highly-interconnected networks pervade our society and the natural world around us, including the World Wide Web, social networks, knowledge graphs, genome and scientific databases, medical and government records. The massive scale of graph data often surpasses the available computation and storage resources. Besides, users get overwhelmed by the daunting task of understanding and ...
A quantitative analysis method for comitant exotropia using video-oculography with alternate cover
BACKGROUND: The purpose of this study was to evaluate the efficacy of a quantitative analysis method for comitant exotropia using video-oculography (VOG) with alternate cover. METHODS: Thirty-four subjects with comitant exotropia were included. Two independent ophthalmologists measured the angle of ocular deviation using the alternate prism cover test (APCT). The video files and data of changes...
Reducing Million-Node Graphs to a Few Structural Patterns: A Unified Approach
How do graph clustering techniques compare in terms of summarization power? How well can they summarize a million-node graph with a few representative structures? In this paper, we compare and contrast different techniques: METIS, LOUVAIN, SPECTRAL CLUSTERING, SLASHBURN, BIGCLAM, HYCOMFIT, and KCBC, our proposed k-core-based clustering method. Unlike prior work that focuses on various measures ...
Publication date: 2014